Just Say No: Benefits of Early Cache Miss Determination
Abstract
As the performance gap between processor cores and the memory subsystem widens, designers are forced to develop new latency-hiding techniques. Arguably, the most common technique is to use multi-level caches. Each new generation of processors is equipped with more levels of memory hierarchy, with increasing sizes at each level. In this paper, we propose five techniques that reduce data access times and power consumption in processors with multi-level caches. Using information about the blocks placed into and replaced from the caches, the techniques quickly determine whether an access at any cache level will be a miss. Accesses identified as misses are aborted. The structures used to recognize misses are much smaller than the cache structures. Consequently, both data access times and power consumption are reduced. Using the SimpleScalar simulator, we study the performance of these techniques for a processor with five cache levels. The best technique is able to abort 53.1% of the misses on average in SPEC2000 applications. With these techniques, the execution time of the applications is reduced by up to 12.4% (5.4% on average), and the power consumption of the caches is reduced by as much as 11.6% (3.8% on average).
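To make the core idea concrete, the following is a minimal sketch of one way a small structure could track block fills and replacements and answer "will this access miss?" before the cache itself is probed. It is not one of the five techniques from the paper, whose details the abstract does not give. The class name `MissFilter`, its parameters, and the counting scheme are all assumptions made for illustration. The sketch relies on one property: a counter that is zero proves that no resident block maps to it, so a "miss" answer is always safe, while a nonzero counter only means the block may be present.

```python
class MissFilter:
    """Hypothetical early-miss filter (illustrative, not the paper's design).

    Keeps one small counter per bucket of block addresses. A zero counter
    proves that no cached block maps to that bucket, so the access is a
    guaranteed miss and the cache lookup can be skipped.
    """

    def __init__(self, num_counters=1024, block_bits=6):
        self.num_counters = num_counters  # assumed filter size
        self.block_bits = block_bits      # assumed 64-byte cache blocks
        self.counters = [0] * num_counters

    def _index(self, address):
        # Drop the block offset, then fold the block number into the table.
        return (address >> self.block_bits) % self.num_counters

    def on_fill(self, address):
        """Call when a block is placed into the cache."""
        self.counters[self._index(address)] += 1

    def on_evict(self, address):
        """Call when a block is replaced from the cache."""
        i = self._index(address)
        if self.counters[i] == 0:
            raise ValueError("evicting a block that was never filled")
        self.counters[i] -= 1

    def definitely_miss(self, address):
        """True only when the access cannot hit; False means 'maybe present'."""
        return self.counters[self._index(address)] == 0


# Small usage example with an 8-entry filter.
f = MissFilter(num_counters=8, block_bits=6)
f.on_fill(0x1000)
print(f.definitely_miss(0x1000))  # False: the block was filled
print(f.definitely_miss(0x1040))  # True: nothing maps to this bucket
print(f.definitely_miss(0x1200))  # False: aliases with 0x1000, conservative
f.on_evict(0x1000)
print(f.definitely_miss(0x1000))  # True: the block has been replaced
```

In this sketch the filter never reports a miss for a block that is actually cached. Aliasing only causes it to let some real misses through to the cache, which is consistent with the abstract's observation that only a fraction of misses is aborted.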
Similar resources
Cache Performance in Java Virtual Machines: A Study of Constituent Phases
This paper studies the level 1 cache performance of Java programs by analyzing memory reference traces of the SPECjvm98 applications executed by the Latte Java Virtual Machine. We study in detail Java programs’ cache performance of different access types in three JVM phases, under two execution modes, using three cache configurations and two application data sets. We observe that the poor data ...
Quantitative study of data caches on a multistreamed architecture
In this paper, we quantify the effect that fine grained multistreamed interaction of threads within a shared cache has on the miss rate. By concentrating on the miss rate, we focus on just the cache performance and separate ourselves from a given system architecture. We show the effects of cache capacity, associativity, and line size on the miss rates of multistreamed workloads of two, three an...
Method and apparatus for the selective scoreboarding of computation results
Statically scheduled machines do have a disadvantage when dealing with dynamic events, such as cache hit or miss detection. Early VLIW machines were designed without caches, to achieve predictability in memory access. However, such designs suffer in memory performance. To achieve high performance, VLIW architectures must have adequate support for using caches. A simple VLIW design might use an ...
Efficient Fine Grained Synchronization Support Using Full/Empty Tagged Shared Memory and Cache Coherency
Performance results of machines with fine-grain synchronization on individual lock-free data items (e.g., words), such as the MIT Alewife multiprocessor, illustrate the benefits of supporting fine-grain synchronization. The performance benefits are primarily the result of allowing a dataflow style of computation in programming models, and maximizing the exposed parallelism by minimizing the pos...
Publication date: 2002